********************************************************************************
*** Data Paper (MIPD Release 2.0): Consequences of multiple quasi-responses
***
*** Created: 10/23/23
***
********************************************************************************


*** Load the data
use "MIPD -- Release 2.0.dta", clear

********************************************************************************
*** Generate the quasi-responses dataset
********************************************************************************

bys studyid: gen obsid = _n

keep studyid obsid fw_* survey* interview party_id age male urban white edulevel income_quartile mipcode*_r1

* Reshape the data: strip the _r1 suffix so the trailing number becomes the quasi index
qui forvalues v = 1/22 {
	rename mipcode`v'_r1 mipcode`v'
}

reshape long mipcode, i(studyid obsid) j(quasi)

* Drop quasi-response slots that were not used
drop if missing(mipcode)

* Which observations have more than one quasi-response?
bys studyid obsid: egen num_quasi = max(quasi)
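
* Optional check, not part of the original analysis: max(quasi) is the highest
* filled slot, which equals the number of quasi-responses only if slots are
* filled in order with no gaps. If that holds, the count below is 0.
* nrows is a scratch temporary variable introduced here for illustration.
tempvar nrows
bys studyid obsid: gen `nrows' = _N
count if num_quasi != `nrows'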

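* What percentage have different *specific* categories of quasi-responses?
* The first row of each respondent is always flagged, so num_diff is the
* number of distinct specific codes (1 = all quasi-responses share one code).
tempvar diff
sort studyid obsid mipcode

bys studyid obsid (mipcode): gen `diff' = 1 if mipcode != mipcode[_n-1]
bys studyid obsid: egen num_diff = total(`diff')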

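* What percentage have different *general* categories of quasi-responses?
* General category = code rounded to the nearest hundred; num_diff_c counts
* the distinct general categories per respondent.
tempvar diff_c cat

gen `cat' = round(mipcode, 100)
sort studyid obsid `cat'

bys studyid obsid (`cat'): gen `diff_c' = 1 if `cat' != `cat'[_n-1]
bys studyid obsid: egen num_diff_c = total(`diff_c')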
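
* A quick arithmetic check of round(x, 100) as used for the general categories.
* The values are illustrative only, not taken from the data, and the floor()
* alternative is an assumption about how the codebook groups codes.
display round(149, 100)          // 100
display round(151, 100)          // 200: codes above x50 move to the next hundred
display floor(151/100) * 100     // 100: keeps every code under its leading hundred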

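* What percentage have different *specific* categories between first and second quasi?
tempvar f s first second
gen `f' = 1 if quasi == 1
gen `s' = 1 if quasi == 2

* Spread the first and second codes to every row of the respondent
bys studyid obsid: egen `first' = total(`f' * mipcode), miss
bys studyid obsid: egen `second' = total(`s' * mipcode), miss

* 1 = first and second codes differ; 0 = same code or no second quasi-response.
* Requiring both to be nonmissing keeps single-response respondents from being
* flagged, since a code compared against missing always counts as different.
gen diff_1and2 = (`first' != `second') & !missing(`first', `second')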
	
* What percentage have different general categories between first and second quasi?
* Reuses the general-category variable generated above.
tempvar first_c second_c

bys studyid obsid: egen `first_c' = total(`f' * `cat'), miss
bys studyid obsid: egen `second_c' = total(`s' * `cat'), miss

* 1 = general categories differ; 0 = same category or no second quasi-response
gen diff_cat_1and2 = (`first_c' != `second_c') & !missing(`first_c', `second_c')
	
* Keep one row per respondent (the first quasi-response)
sort studyid obsid quasi
duplicates drop studyid obsid, force
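
* Optional respondent-level summaries, sketched here to answer the questions
* posed in the comments above; not part of the original file. After the drop
* there should be one row per respondent, which isid confirms.
isid studyid obsid
tab num_quasi
tab num_diff
tab num_diff_c
* The first-versus-second comparisons only apply to respondents with a second response
tab diff_1and2 if num_quasi > 1
tab diff_cat_1and2 if num_quasi > 1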

save "Multiple Responses.dta", replace
